---
title: R- Where the semantic publishing rubber meets the scholarly practice road
enableToc: false
creation date: $=dv.current().file.ctime
last modified date: $=dv.current().file.mtime
author: Joel Chan, Xin Qian, Katrina Fenlon, Wayne Lutters
year: 2020
reference: "https://github.com/sig-cm/JCDL-2020/blob/master/JCDL_Where_the_rubber_meets_the_road_2020-6-28-FINAL.pdf"
tags:
- resource
status: 
alias: 
---

- Context: 
    - Started [June 12th, 2020]
    - In response to this meeting idea: More interesting: consider how these elements are instantiated in actual scholarly practice: what challenges emerge? 
    - Based on last year, ~3k words (~ 6 pages)
    - https://sig-cm.github.io/news/JCDL-2020-CFP/
- External docs
    - Google Doc: https://docs.google.com/document/d/1I59bDkF7bJdxXgPg6IQ4BveMbGcalj_mHP4UOqY7E0A/edit#
    - Overleaf: https://www.overleaf.com/project/5ef6e0206667020001141441
    - Talk 
        - Outline/script
            - Motivation
                - Synthesis is hard (this is the ultimate problem we care about)
                - Big part of problem is [[Z: Most scholarly communication infrastructure operates on the document as the base unit]]
                    - See [Questions we really want to ask for synthesis]
                - Semantic digital libraries could really help!
                - We have warehouses
                - But they're kind of empty
                    - [[Z- The central bottleneck to synthesis infrastructures is authoring]]
                - Clearly it's not the case that "if we build it, they will come". How then might we fill them?
            - Problem
                - What we've tried (that hasn't been enough):
                    - [specialized [curator] model of semantic publishing] 
                        - accurate but has serious sustainability issues
                    - [text mining model of semantic publishing]
                        - cheap but inaccurate (horizon still quite far off)
                            - though in conjunction with human labor might be good
                - We're exploring the [scholar-powered model of semantic publishing]
                - In this talk our goal is to show you a new/untapped opportunity for a kind of [scholar-powered model of semantic publishing] - specifically, we're going to show
            - Findings
                - 
            - Discussion points:
                - What, if any, existing [standards] and conceptual models could be suitable for gluing these individual practices to a collaborative context in a way that enables [interoperability]?
                    - cc [[Z- Distinction between neat vs scruffy in Semantic Web engineering]] - this is more towards the "scruffy" end of things
                - Challenge of bridging the personal and the global
                    - [[idea: local semantic publishing]]?
        - Slides:
            -  https://docs.google.com/presentation/d/e/2PACX-1vRr1PZxly5xj7LLiSBQpnPwcZV8VRejhFFuOcXCCX2-vbVwL0iv4QQ_wykWaj8l8aO--5NE6oGMJ_Qf/embed?start=false&loop=false&delayms=3000
    - Submitted manuscript
        - https://drive.google.com/file/d/15BpvVC9s0SLHX0ZPYTVKgjVTmBgUk2kp/preview
    - citation:
        - [[@chanWhereRubberMeets2020]]
- Examples / data (for results)
    - # [[Virtuosos]]
        - P2 from [[John Thesis]] [[@morabitoManagingContextScholarly2021]]: multiple highlight colors - green = "context", blue = "quote/idea":
            - ![](https://firebasestorage.googleapis.com/v0/b/firescript-577a2.appspot.com/o/imgs%2Fapp%2Fmegacoglab%2FJqNuJfGznt.png?alt=media&token=90298b35-9650-47f5-8f73-4ec067e99fa0)
        - P3 from [[John Thesis]]: rich summary of papers with many details that are useful for the current context, but maybe not as useful for others
            - ![](https://firebasestorage.googleapis.com/v0/b/firescript-577a2.appspot.com/o/imgs%2Fapp%2Fmegacoglab%2F-zPx7Ej5u7.png?alt=media&token=216e40ad-5b52-4044-b1b8-88555faa6c1a)
        - WW from #[[@qianOpeningBlackBox2020]]: summaries of papers, with many links to related ideas as context, snapshots of key figures / details from text
            - ![](https://firebasestorage.googleapis.com/v0/b/firescript-577a2.appspot.com/o/imgs%2Fapp%2Fmegacoglab%2Fse1CgbdDAy.png?alt=media&token=bab3aebb-71f4-4894-8253-a0a85984a20a)
        - NB from #[[@qianOpeningBlackBox2020]]: structured summaries for papers / books to be read in detail. Includes summaries of "main points", but also important details and evidence, and notes about what aligns or conflicts
            - ![](https://firebasestorage.googleapis.com/v0/b/firescript-577a2.appspot.com/o/imgs%2Fapp%2Fmegacoglab%2FQAx1lK7E8B.png?alt=media&token=e01c4095-a358-4dc9-b808-ce1daddf31e7)
    - # [[Explorers]]
        - P5 from [[John Thesis]] extract segments and relate to them in "typed" ways (but not formally) on a canvas, such as "perspectives __from...__"
            - ![](https://firebasestorage.googleapis.com/v0/b/firescript-577a2.appspot.com/o/imgs%2Fapp%2Fmegacoglab%2FA7e4yFza6r.png?alt=media&token=715da9a7-eb3e-4cee-86c2-7c5a2e266328)
            - Note: each of these excerpts has rich mechanisms for [[context]], e.g., connecting to other pieces (because they are "disembedded"), "transclude" in new contexts, in addition to auto signals to name of pdf, page number, and quick jump back to original location of excerpt. Same with the [[QDAS]] route.
                - ![](https://firebasestorage.googleapis.com/v0/b/firescript-577a2.appspot.com/o/imgs%2Fapp%2Fmegacoglab%2FljDMKgyu30.png?alt=media&token=242cb4a4-6371-4660-bbaf-159c0edb3dde)
        - P1 from [[John Thesis]]: "code" excerpts from papers, place in code tree
            - ![](https://firebasestorage.googleapis.com/v0/b/firescript-577a2.appspot.com/o/imgs%2Fapp%2Fmegacoglab%2FZc0m-4oFu1.png?alt=media&token=5bdb5de0-112d-422e-902b-a2a4fd3fd115)
        - see list of Relevant communities around [[hypertext notebooks]]
        - #[[@omiHowTakeSmart2020]]
    - # [[Hackers]]
        - [List: Examples of lead users building their own thing]
        - see list of Relevant communities around [hypertext notebooks]
        - [[sys/Zettelkasten]] with [[sys/org-mode]] https://blog.jethro.dev/posts/zettelkasten_with_org/
        - [Digital Gardeners Telegram group]
        - Examples of [Digital Garden]s from [[P- Maggie Appleton]]
            - https://twitter.com/Mappletons/status/1250532315459194880
        - One paradigmatic example of the hacker persona is [Andy's working notes] - [[P- Andy Matuschak]]
        - Another paradigmatic example is [Jethro Kuan], who developed [[Org-roam]] to be able to implement [[Zettelkasten]] practices in [[Org-mode]].
            - rationale described in #[[@kuanHowTakeSmart2020]]
                - basically motivated by wanting to do [[sys/Zettelkasten]] (as described in #[[R- How to Take Smart Notes]]) on top of [[Org-mode]], but finding that the vanilla version wasn't enough
        - Another good example of the hacker persona is [Ian Jones]'s [Digital Garden]. 
            - Ian isn't an actively practicing scientist: he's primarily a software developer by trade, but he works with [[egghead.io]] (with [[P- Maggie Appleton]]) on creating effective programming-related explainers and tutorials. So he is essentially a learning scientist in practice. 
            - He is also an [autodidact], committed to constantly learning, for example, [how to teach a beginner topic](https://www.ianjones.us/2020-02-10-how-to-teach-a-beginner-topic). He is very concerned with creating a "[second brain]", and tries to write to learn to [create grist and reference material for courses or blog posts](https://www.ianjones.us/notes).
            - Initially he [used](https://www.ianjones.us/2019-12-19-how-we-use-notions) [[sys/Notion]] (in part through influence of being in [Building a Second Brain (BASB)] cohort), then [moved to](https://www.ianjones.us/2019-12-19-roam-research) [[Roam Research]], but recently [decided to move](https://www.ianjones.us/2020-05-05-doom-emacs) to [[Org-Roam]] so he can "own" his [second brain].
            - A key feature that he tries to accomplish is [composability] through linking, a la [[sys/Zettelkasten]]. This method of composability partially relies on effective [compression] of ideas into relatively atomic notes about key ideas, that he can then densely link with others. This partially achieves [contextualizability] as well.
            - A key innovation in [[Org-Roam]] was the inclusion of automatic [[bi-directional links]]. Other hacker-types like [[P- Andy Matuschak]] have written scripts to do this in a more plain-text, [[Markdown]] setting like [[Bear]]
                - [[P- Andy Matuschak]] wrote a script [[sys/note-link-janitor]] to enable [[bi-directional links]] on top of [[sys/BearNotes]], which he chooses to use because it's offline, and he also wants to experiment with building [Tools for Thought]
            - What I want to highlight, though, is a hack that he made to provide richer [contextualizability] links back to original sources, using an [[sys/org-mode]] extension that can connect specific excerpts from PDFs to notes - each note then becomes a "pointer" back to a specific segment in the PDF. 
                - In his words, described in [this post](https://www.ianjones.us/org-roam-bibtex#orgb716c96) #[[@jonesOrgRoamBibtex2020]]
                    - ![](https://firebasestorage.googleapis.com/v0/b/firescript-577a2.appspot.com/o/imgs%2Fapp%2Fmegacoglab%2FjsujVrXcM2.png?alt=media&token=006eb7d9-e5a6-46f6-921d-ebdc3ee64980)
        - Another example of a hacker persona (who may not personally have deep development knowledge, but can team up well with those who do) is [[P- Anne-Laure Le Cunff]]'s [Digital Garden]
            - ![](https://firebasestorage.googleapis.com/v0/b/firescript-577a2.appspot.com/o/imgs%2Fapp%2Fmegacoglab%2FRSQ8y5CSi6.png?alt=media&token=0ef49845-45b1-461f-aeac-7b2d294d82a5)
            - Built on top of [[sys/TiddlyWiki]], and inspired by [[P- Andy Matuschak]]'s approach, she also employs [[compression]] to enable richer [[[[context]]ualizability]] through [[bi-directional links]]. This particular instance, though, has less connection to primary/original sources, which is fairly typical of many of the [[Digital Garden]]s at play right now.
            - [[P- Anne-Laure Le Cunff]] writes on Slack that most of what's needed is already in place, free and open-source: "[[sys/TiddlyWiki]]+ [[sys/TiddlyBlink]] (for [[bi-directional links]]) + Transclude Popup Plugin for the previews (https://giffmex.org/gifts/transclusioninpopups.html) + export everything as a static website (tutorial here: https://nesslabs.com/tiddlywiki-static-website-generator) and host it on GitHub pages for free."
        - [[P- Stian Håklev]] is another example: back in his PhD, he created a personal wiki to enable tighter integration between his ideas and his sources (strongly motivated by desire for [[[[context]]ualizability]])
            - called it [[sys/researchr]], and was heavily based on a personal [[sys/Wiki]] approach, with some integrations with bibtex and PDFs. described in this video #[[@haklevUsingResearchrTagextract2012]]
            - later migrated to [[Roam Research]] and is now core member of community. 
                - video on youtube describing the transition ([[@haklevReproducingPhDReading2019]]) was key to me getting interested enough in [[Roam Research]] to give it a real go
        - [[sys/vim-wiki]]
        - [[sys/GlamorousToolkit]] - not purpose-built for [[synthesis]], but very easy to do it in here, with huge advantages for [[Multiplicity]] due to its superior programmability
        - Values
            - openness, "working with the garage door up" / [[learning in public]]
            - privacy, [[sustainability]] (not just the "next hot tool"), being "future proof" (is a big reason to work in plain text, in [[sys/emacs]] or [[std/Markdown]])
        - Other notes
            - Many are not scholars, but some are! Especially as the line between explorers and hackers blurs with the advent of more "plugin-friendly" systems like [[sys/Obsidian]] and [[Roam Research]] (in the latter, you can now write simple javascript to augment the experience; have personal experience of a law professor writing some simple javascript to visually, with some minimal semantics, distinguish source excerpts from his own thoughts)
                - see Trick: use embedded JS to modify the DOM based on tags to make visual distinctions between my thoughts and others' thoughts.  from [[Roaman lit-reviewing meetup June 5th, 2020]]
                - Can see these as instances of [[end-user programming]], probably not too far from the [[zone of proximal development]] for many of these researchers, especially as the lines between "technical/programming" work (e.g., move from SPSS to [[sys/R]]) become blurred
                    - the [[ProgrammingHistorian]] is another example of this, as is [[Kieran Heasly]]
                - We now also have [[sys/FoamBubble]]!!
            - As norms move towards more [[open science]], [[open data]], digital-based workflows, and so on, there is significant opportunity to integrate with these favorable trends to explore how [[semantic publishing]] standards might be better integrated into [[scholarly workflows]]
            - Hackers are important because they can create a more open ecosystem of tools (cf. [[sys/Athens]], [[sys/FoamBubble]], [[sys/Obsidian]]) that are easier to try. This broadens the base beyond those who are tied to a particular platform or price point.
    - New tools we can integrate with
        - https://twitter.com/MuseAppHQ/status/1273698452539609088
- Comments/notes
    - Important to be clear about what precisely we mean by [[semantic publishing]]: uptake of __what__ by __whom__, __where__
        - Referring to: Yet, on the whole, we're not seeing nearly as much of this transformation as we'd like. Uptake of [[semantic publishing]] is low and restricted to a small set of power users
        - What: not just linked entities, but the whole shebang: [[Atomicity]], [[context]], [[composability]], [[Multiplicity]]. In other words, [[semantic publishing]] for [[synthesis]]! 
            - Not unique to us! see, e.g., #[[R- Genuine semantic publishing]], #[[@grozaSALTWeavingClaim2007]]
            - Contra the slightly broader definition by #[[@shottonSemanticPublishingComing2009]]
                - "In the present context, I define '[[semantic publishing]]' as ^^anything that enhances the meaning of a published journal article, facilitates its automated discovery, enables its linking to semantically related articles, provides access to data within the article in actionable form, or facilitates integration of data between papers^^. Among other things, it involves enriching the article with appropriate metadata that are amenable to automated processing and analysis, allowing enhanced verifiability of published information and providing the capacity for automated discovery and summarization. These semantic enhancements increase the intrinsic value of journal articles, by increasing the ease by which information, understanding and knowledge can be extracted. They also enable the development of secondary services that can integrate information between such enhanced articles, providing additional business opportunities for the publishers involved. Equally importantly, readers benefit from more rapid, more convenient and more complete access to reliable information" 
        - [[Katrina Fenlon]]'s review can help us provide specific, concrete examples, including things like [[std/Nanopublications]]: https://docs.google.com/document/d/1uI30nLzcca7gwMZ-MGXbnedk_NXFUvqVrcIpRssRz-I/edit
    - Old/bad outline
        - #[[Joel Chan]] 16:05 This doesn't feel right.
        - Synthesis is a thing.
        - It's hard.
        - We're trying to build tools to help.
        - Historically the focus of your work on data modeling has been on the practice of a community, or in the practice of specialized curators. Here we're focusing on what that might look like in the practice of an individual scholar.
        - As we studied and built stuff, we started to notice a pattern of needs emerging, in terms of "what's missing": these 3/4 things (Compression, Context, Composability, etc.)
        - This pattern of needs is quite similar to the requirements/motivations of a BUNCH of data models that have been developed already by your community, for alternative infrastructures for synthesis.
    - Relevant stuff
        - Contribution frames
            - #[[Authoring Bottleneck]]
            - #[[Z- The central bottleneck to synthesis infrastructures is authoring]]
                - #[[Z: Models of authoring for semantic publishing in scholarly communication infrastructures]]
            - #[[Q- Do scholarly synthesis infrastructures already exist]]
            - Discussing "where we are" in the [[infrastructure]] (note that standards are probably ok, tooling is not) is helpful, and thinking through the challenges in the other parts of the infrastructure will yield generative conversations
            - Renewed charge to build a semantic knowledge layer, dating back to [[Semantic Web]], [[std/Hypertext]], etc., but refined with recent advances in [[infrastructure]] studies and [[reuse]], and experiences with trying to build these layers, including the thread on [[standard/Claims]] within HCI
            - important to think through more carefully why [[sys/Xanadu]] and the many [[Semantic Web]] and [[std/Hypertext]] or [[sys/Memex]] related projects have “failed”.
            - More interesting: subtleties, nuances of the elements of the framework
        - What do we already know (lit-wise) about the rubber meeting the road?
            - Adding to thread on [[Authoring Bottleneck]]: [[Z: What is the user experience of dedicated semantic authoring]] and [[Q: What is the user experience of semantic authoring within regular scholarly workflows]]
            - #[[@kuhnBroadeningScopeNanopublications2013]] has a user study that checks how easy it is to train 16 biomed researchers to convert a short text into a natural language statement (no [[formality]] though!)
        - Potential new data sources for our observations / anecdotes to add the "road" to the rubber
            - Found old thread of video recordings on my YouTube channel that really nicely track the evolution (and constancy) of our ideas around #[[D/Synthesis Infrastructure]]!
            - [[C- Effective individual synthesis systems (seem to mostly) exist (for a select few)]]
        - Analogous threads
            - Reading again the [[CEDAR project]]: motivation and lessons are similar for [[open data]]: everyone knows science is better for it if data are shared with appropriate metadata, following [[FAIR principles]]. But uptake is low, except when there are extreme incentives. Some stuff I saw at [[iConference]] last year along this vein. See also [[@tenopirDataSharingScientists2011]]
        - Other stuff
            - #[[@wolfCurseXanadu1995]]
            - #[[Z: Most scholarly communication infrastructure operates on the document as the base unit]]
    - ➰ breadcrumbs
        - For "data collection" (FOCUS ON THIS FIRST)
            - {{[[DONE]]}} Begin compiling a "results" section. 
                - For this: So far, we're seeing that a surprising amount of that labor is already happening! Completely of their own volition, away from any data models people, real scholars are trying to shape their workflows / setups to satisfy requirements of compression / context / composability / multiplicity.
                - Maybe start with something high level like 
                    - "How are scholars doing "semantic publishing"-like labor in their synthesis practice?" #[[Z]]
        - For motivation / framing
            - {{[[DONE]]}} Read and zettel a few seed papers on "uptake" of specifically "semantic publishing" for scholarly work
                - Seed is [[@kuhnNanopublicationsGrowingResource2018]], which reports around 10 million [[std/Nanopublications]] published at the time of writing, albeit almost all within bioinformatics, and overwhelmingly dominated by a small (N=41!!) set of authors
            - {{[[DONE]]}} Start a thread collecting state of the art on text mining for semantic publishing
                - Seed is #[[@kilicogluBiomedicalTextMining2017]]
- Outline:
    - ### 1. The problem: the [[Authoring Bottleneck]] for "genuine" [[semantic publishing]] for [[[[scholarly communication]] [[infrastructure]]]]
        - We believe (as do you!) that [[semantic publishing]] could transform [[scholarly communication]] ([[@berners-leePublishingSemanticWeb2001]]) 
            - We resonate a lot with the vision cast by #[[@renearStrategicReadingOntologies2009]]: "Scientists will still read narrative prose, even as text mining and automated processing become common; however, these reading practices will become increasingly strategic, supported by enhanced literature and ontology-aware tools. As part of the publishing workflow, scientific terminology will be indexed routinely against rich ontologies. More importantly, formalized assertions, perhaps maintained in specialized 'structured abstracts' (27), will provide indexing and browsing tools with computational access to causal and ontological relationships. Hypertext linking will be extensive, generated both automatically and by readers providing commentary on blogs and through shared annotation databases. At the same time, more tools for enhanced searching, scanning and analyzing will appear and exploit the increasingly rich layer of indexing, linking, and annotation information." (p. 832)
            - [[Z: Most scholarly communication infrastructure operates on the document as the base unit]]
            - #@grozaSALTWeavingClaim2007
        - It's not that we have made no progress! The [[semantic publishing]] revolution is indeed underway! We see encouraging developments in [[bioinformatics]] and [[archeology]], for example. 
        - Yet, on the whole, we're not seeing nearly as much of this transformation as we'd like. Uptake of [[semantic publishing]] is low and restricted to a small set of power users
        - Discussion of what it will take to fully realize this transformation is robust and ongoing in this community. One important part of the conversation is a sense that [[Z- The central bottleneck to synthesis infrastructures is authoring]]
            - For example, [[R- Genuine semantic publishing]] notes, "It turns out that all the technologies needed for applying genuine semantic publishing are already available and most of them are very mature and reliable. There are no __technical__ obstacles preventing us from releasing our results from today on as genuine semantic publications, even though more work is needed on ontologies that cover all relevant aspects and areas and on nice and intuitive end-user interfaces to make this process as easy as possible." (p. 148)
    - ### 2. The general solution we're excited about: explore scholar-powered contribution models of [[semantic publishing]]
        - The [[specialized [[curator]] model of semantic publishing]] is currently the engine of what uptake exists. But it requires a lot of funding, for a really long time! It's a great fit for well-funded domains like biomedical sciences, but is a much harder sell for many other domains of knowledge with less obvious funding implications, like the humanities and social sciences. And it's hard to predict in advance precisely which fields are going to be "worth funding": knowledge doesn't work that way!
        - There is an exciting thread of conversation that is exploring how to address the [[Authoring Bottleneck]] with a more "scholar-powered" model of contribution. For example, many propose an [[author contribution model of semantic publishing]], in tandem with a [[text mining model of semantic publishing]] (#[[@monsWhichGeneDid2005]], #[[@grozaSALTWeavingClaim2007]], #[[@berners-leePublishingSemanticWeb2001]]). 
        - We think this is a promising approach! There is potential alignment of incentives, with [[semantic publishing]] providing high potential payoffs in visibility and impact of work. But we're unsure
        - Exemplars of alternative approaches
            - Integrate with [[peer review]] #[[@bucurPeerReviewingRevisited2019]]
        - In our work, ^^we are exploring the potential of addressing the authoring bottleneck by integrating into the work that scholars are already doing^^, as a replacement/supplement to text mining and specialized labor.
    - ### 3. Our intended contribution to this conversation: investigate integration points in existing [[scholarly workflows]] for [[synthesis]]
        - In this paper, we want to share some preliminary findings for a foundational question: where (if at all) are there integration points in **__existing synthesis practice__** with the labor required to power synthesis infrastructures? In other words, we're investigating: ^^what (if any) [[semantic publishing]] work is already being done by scholars in their [[scholarly workflows]], not just in the authoring/publishing process^^? [[Q- What semantic publishing work is already being done by scholars in their scholarly workflows?]]
        - Revealing these "integration points" would help us see where we might be able to "accrete" semantic authoring tools to leverage the rich semantic work that is already happening. 
        - Investigating these integration points could also help us understand how to better design "intuitive user interfaces" for [[authoring tools]]: beyond "usability", we could improve [[sustainability]] by improving, rather than disrupting, existing [[scholarly workflows]]
        - At the same time, we could help address incentive mismatches if these integration points allow us to also significantly augment [[individual synthesis]] so that the labor of [[semantic publishing]] yields more immediate benefits.
        - So far, we're seeing that a surprising amount of that labor is already happening! Completely of their own volition, away from any data models people, real scholars are trying to shape their workflows / setups to satisfy requirements of compression / context / composability / multiplicity.
    - ### 4. Interlude: what do we mean by [[semantic publishing]] labor?
        - Here is where we talk about the framework of [[compression]], [[context]], [[composability]], and (maybe) [[Multiplicity]]. It's our frame for looking.
    - ### 5. A taste of our findings
        - By dimension
            - [[compression]] / [[Scholars are (already) doing compression work]]
            - [[context]] / [[Scholars are (already) doing contextualization work]]
            - [[composability]] / [[Scholars are (already) doing composability work]]
        - By practice
            - ^^Semantically rich annotations^^
                - Summary
                    - Use annotations to identify "subdocuments" / ideas / building blocks for reuse ([[compression]])
                    - Include contextual details as well ([[context]])
                    - Employ sophisticated mapping with colors to identify types of building blocks ([[composability]])
                    - Noted as [[in-source annotations]] in #[[@qianOpeningBlackBox2020]]
                - Examples
                    - P2 from [[John Thesis]] [[@morabitoManagingContextScholarly2021]]: multiple highlight colors - green = "context", blue = "quote/idea":
                        - ![](https://firebasestorage.googleapis.com/v0/b/firescript-577a2.appspot.com/o/imgs%2Fapp%2Fmegacoglab%2FJqNuJfGznt.png?alt=media&token=90298b35-9650-47f5-8f73-4ec067e99fa0)
            - ^^Power tools for power context^^
                - Summary
                    - Others push past the limits of their workflows towards less mainstream tools, or even repurposing other tools like [[QDAS]]
                    - In [[QDAS]]:
                        - Achieve [[compression]] more explicitly, creating excerpts that are manipulable by themselves, but also make sure they're typed ([[composability]]) and have particular semantically meaningful relationships to other compressed segments ([[composability]]). A TON of [[context]] comes for free: no need to do the tedious manual work of writing down a source name and page number. Can see this as a next-level evolution of the practice of highlighting that we've seen.
                    - In [[sys/LiquidText]]
                        - Note: this is interesting because [[sys/LiquidText]] explicitly builds on the deep work in [[active reading]]
                - Examples
                    - P1 from [[John Thesis]]: "code" excerpts from papers, place in code tree
                        - ![](https://firebasestorage.googleapis.com/v0/b/firescript-577a2.appspot.com/o/imgs%2Fapp%2Fmegacoglab%2FZc0m-4oFu1.png?alt=media&token=5bdb5de0-112d-422e-902b-a2a4fd3fd115)
                    - P5 from [[John Thesis]] extract segments and relate to them in "typed" ways (but not formally) on a canvas, such as "perspectives __from...__"
                        - ![](https://firebasestorage.googleapis.com/v0/b/firescript-577a2.appspot.com/o/imgs%2Fapp%2Fmegacoglab%2FA7e4yFza6r.png?alt=media&token=715da9a7-eb3e-4cee-86c2-7c5a2e266328)
                        - Note: each of these excerpts has rich mechanisms for [[context]], e.g., connecting to other pieces (because they are "disembedded"), "transclude" in new contexts, in addition to auto signals to name of pdf, page number, and quick jump back to original location of excerpt. Same with the [[QDAS]] route.
                            - ![](https://firebasestorage.googleapis.com/v0/b/firescript-577a2.appspot.com/o/imgs%2Fapp%2Fmegacoglab%2FljDMKgyu30.png?alt=media&token=242cb4a4-6371-4660-bbaf-159c0edb3dde)
            - ^^Rich TL;DRs^^
                - Summary
                    - Includes summaries of main claims ([[compression]]) and how they relate to each other ([[composability]]), as well as key details ([[context]]) to be used
                    - This recalls some practices that others also do, see, e.g., [[memorandums]], as described and popularized by [[Raul Pacheco-Vega]]
                    - Noted as [[per-paper summaries]] in #[[@qianOpeningBlackBox2020]]
                - Examples
                    - NB from #[[@qianOpeningBlackBox2020]]: structured summaries for papers / books to be read in detail. Includes summaries of "main points", but also important details and evidence, and notes about what aligns or conflicts
                        - ![](https://firebasestorage.googleapis.com/v0/b/firescript-577a2.appspot.com/o/imgs%2Fapp%2Fmegacoglab%2FQAx1lK7E8B.png?alt=media&token=e01c4095-a358-4dc9-b808-ce1daddf31e7)
                    - WW from #[[@qianOpeningBlackBox2020]]: summaries of papers, with many links to related ideas as context, snapshots of key figures / details from text
                        - ![](https://firebasestorage.googleapis.com/v0/b/firescript-577a2.appspot.com/o/imgs%2Fapp%2Fmegacoglab%2Fse1CgbdDAy.png?alt=media&token=bab3aebb-71f4-4894-8253-a0a85984a20a)
                    - P3 from [[John Thesis]]: rich summary of papers with many details that are useful for the current context, but maybe not as useful for others
                        - ![](https://firebasestorage.googleapis.com/v0/b/firescript-577a2.appspot.com/o/imgs%2Fapp%2Fmegacoglab%2F-zPx7Ej5u7.png?alt=media&token=216e40ad-5b52-4044-b1b8-88555faa6c1a)
            - ^^Richly structured shared artifacts^^
                - 
        - By persona
            - Can write synthetic vignettes, then show concrete examples
            - The personas
                - ^^The virtuosos^^ [[Virtuosos]]
                    - Where many people are
                    - Often includes substantial analog work
                - ^^The early adopters and appropriators^^ [[Explorers]]
                    - [[sys/LiquidText]], [[Roam Research]], [[sys/TiddlyWiki]], [[sys/TinderBox]], etc.
                    - [[QDAS]], excel, etc.
                    - Let's talk about the [[hypertext notebooks]]
                        - The conceptual and technical roots of this wave can be traced to the influential ideas of [[Vannevar Bush]], [[Ted Nelson]], and [[Doug Engelbart]] around [[std/Hypertext]]. This scene has also been heavily influenced by the [[sys/Zettelkasten]] approach to [[Knowledge Management]], which originated with [[Niklas Luhmann]] and was later popularized in the English-speaking world by [[zettelkasten.de]], by [[Sönke Ahrens]] with [[@ahrensHowTakeSmart2017]], and, most recently, beginning approximately in [[2019]], by the emergence of [[Roam Research]]. There is also substantial influence from [[@andy_matuschak]], an independent researcher in the US Bay Area, who has put forward the concept of [[sys/Evergreen Notes]] and was one of the first to start sharing his own notes. Andy himself is better understood as one of the [[Hackers]], as we will discuss in more detail.
                        - The key practices and affordances of [[hypertext notebooks]] focus on the creation and maintenance of relatively **atomic** notes (either a concept or some kind of focused "claim") that are densely linked together. The links are typically accomplished through [[bi-directional links]], where, every time a link is made __from__ one source note to a target note, both the source and target notes record the link. In this way, links between notes are more accessible, since links can be followed from either source or target notes. 
                        - [[bi-directional links]] are a key innovation on top of the original [[sys/Wiki]] approach, because they enable more focused usage and labor around tending to connections between notes ([[composability]]). The links also enable [[compression]] by supporting refactoring of ideas into smaller pieces, knowing that you will still be able to follow threads of logic: users can compress quite complex ideas into a single statement (e.g., "knowledge is contextual") while retaining links to the less compressed ideas that "unpack" different aspects and subtleties of the more complex idea. In this way, tending to the notes and links also enhances the [[[[context]]ualizability]] of each note.
                            - [[Andy's working notes]] has concept of "layers" in a taxonomy of [[sys/Evergreen Notes]]
                            - See also:
                                - #[[@saschaThreeLayersEvidence2019]]
                                - #[[@saschaTaleComplexityStructural2018]]
                            - [[@andy_matuschak]] wrote a script [[sys/note-link-janitor]] to enable [[bi-directional links]] on top of [[sys/BearNotes]], which he chooses to use because it's offline, and he also wants to experiment with building [[Tools for Thought]]
                        - The system emphasizes separation between "your thoughts" (the main content of these networked notebooks) and "others' thoughts" (which should live in "literature notes" that are linked to, but separated from, the networked notebook). The atomicity of the notes is a way to achieve [[compression]], both obviously (by breaking things down) and in more subtle ways, by compressing quite complex ideas into a single statement (e.g., "knowledge is contextual") while retaining links to the less compressed ideas that "unpack" the complex idea.
                        - The atomicity and [[bi-directional links]] also emphasize the deliberate practice of [[composability]]: developing ideas into more complex and better forms over a long period of time. The phrase "finding connections" is a staple in this community.
                        - Since notes that collect [[bi-directional links]] can be as "small" as a single concept, the act of deliberately linking notes partially accomplishes the work of deliberately developing [[folksonomies]], which approximate the more formal [[ontologies]] work of [[semantic publishing]] for [[composability]]. A key affordance in these [[hypertext notebooks]] is that it is quite easy to rename note titles (changes to a note's title typically propagate automatically throughout the database of notes), which enables more agile and evolving [[ontologies]].
                            - People in the [[Hackers]] community are actively working on ways to manage aliases and merging. 
                                - People also use [[bi-directional links]] to do [[Contextual Bootstrapping]] of ideas, creating "empty pages" or "hooks" for concepts. 
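                        - A minimal sketch of the mechanics described above (backlinks, "empty page" hooks, and title renames that propagate), written in Python for illustration only. The `Notebook` class, its method names, and the regex are assumptions made for this sketch; they are not how [[Roam Research]], [[sys/note-link-janitor]], or any other tool mentioned here is actually implemented. Nested links such as `[[[[context]]ualizability]]` are deliberately not handled.

                          ```python
                          import re

                          # Matches flat [[wiki links]]; nested links are out of scope for this sketch.
                          LINK = re.compile(r"\[\[([^\[\]]+)\]\]")


                          class Notebook:
                              """Hypothetical in-memory store of atomic notes, keyed by title."""

                              def __init__(self):
                                  self.notes = {}  # title -> body text

                              def add(self, title, body=""):
                                  """Store a note and create empty 'hook' pages for any new link targets."""
                                  self.notes[title] = body
                                  for target in LINK.findall(body):
                                      self.notes.setdefault(target, "")

                              def outgoing(self, title):
                                  """Titles that this note links to."""
                                  return sorted(set(LINK.findall(self.notes[title])))

                              def backlinks(self, title):
                                  """Titles of notes that link to this note (the 'other direction')."""
                                  return sorted(
                                      source
                                      for source, body in self.notes.items()
                                      if title in LINK.findall(body)
                                  )

                              def rename(self, old, new):
                                  """Rename a note and rewrite every link that pointed at the old title."""
                                  if new in self.notes:
                                      raise ValueError(f"a note titled {new!r} already exists")
                                  self.notes[new] = self.notes.pop(old)
                                  for source in list(self.notes):
                                      self.notes[source] = self.notes[source].replace(
                                          f"[[{old}]]", f"[[{new}]]"
                                      )


                          nb = Notebook()
                          nb.add("knowledge is contextual", "Unpacked by [[context]] and [[compression]].")
                          nb.add("literature note", "Supports [[knowledge is contextual]].")
                          print(nb.backlinks("knowledge is contextual"))  # ['literature note']
                          nb.rename("context", "contextualizability")
                          print(nb.outgoing("knowledge is contextual"))
                          ```

                          Here backlinks are recomputed by scanning every note, which keeps the sketch short; a real tool would more plausibly maintain an index. The rename step shows why retitling is cheap in this model: a title is just a key, so one pass over the notes keeps every link consistent.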
                - ^^The hackers^^ [[Hackers]]
                - Others??
                    - ^^The wanderer?^^
                        - Where many people are
                        - Moves between many things, in constant flux
        - Some high-level takeaways:
            - Many of these in isolation are not new! Lots of beautifully detailed work in [[active reading]]; see as far back as #[[@oharaStudentReadersUse1998]] for rich examples, or even... as far back as [[Charles Darwin]] and, arguably, as old as external representations ;). What we see here is a different lens with which to view these data, to consider what these behaviors (could) do.
                - The sophistication of the *system* (vs. the medium or tool) might explain [[Z: The stubborn effectiveness of analog media]]
            - Why are they doing this?
                - They *want* to do this work (better), but often don't because it ends up being too costly
                - But the social context is key for at least some of it: a lot of explicit work being done in collaborative settings
                    - Advisor meetings
                    - Large-scale collaborations
                    - Teams
            - Strong opportunity for progress by partnering modeling and standards work with growing user interface innovations (sort of the reverse of making [[semantic publishing]] user interfaces better). 
                - Examples
                    - https://twitter.com/MuseAppHQ/status/1273698452539609088
                    - See also [[sys/Hypothes.is]]
                - Different kinds of questions:
                    - *which* types of annotations at which points might be valuable __for the user__ to formalize / contextualize in some way?
                    - how might formalisms of [[semantic publishing]] deliver immediate value to the scholar at different parts of their scholarly workflow?
                        - identify some of the frictions and pain points in their process. good clue to this is what motivates the move to [[QDAS]] and [[sys/LiquidText]] - quotes from guided tour in [[John Thesis]]?
                    - how much of this is actually happening? is there a way to study this in a way that is analogous to the [[cognitive surplus]] idea from crowdsourcing?
                        - see, e.g., What is the scale at which scientists are producing annotations and notes? In other words, what is the untapped opportunity here? What is the "drag coefficient"? How much energy is being "wasted"?